Noisy distance estimates associated with photometric rather than spectroscopic redshifts lead 
to a mis-estimate of the luminosities, and produce a correlated mis-estimate of the sizes. We 
consider a sample of early-type galaxies from the SDSS DR6 and apply the generalization 
of the Vmax method to correct for these biases. We show that our technique recovers the 
true redshift, magnitude and size distributions, as well as the true size-luminosity relation. 
Beyond the specific application outlined here, our method is relevant to a broader range of 
studies in which at least one distance-dependent quantity is involved. 

1 Introduction and significance 

Galaxy scaling relations play a crucial role in constraining galaxy formation models. However, 
these correlations are intrinsically biased if the transformation from observable 
to physical quantity involves one or more distance-dependent observables, because of noise in the 
distance estimate. Distances are only known approximately if photometric redshifts are available 
but spectroscopic redshifts are not. This is already the case in many current surveys (e.g. SDSS, 
Combo-17, MUSYC, Cosmos), where the number of objects with photometric redshifts is more 
than an order of magnitude larger than the number with spectroscopic redshifts, and will be 
increasingly true of the next generations of deep multicolor photometric surveys (e.g. DES, LSST, SNAP). 

Therefore, methods for recovering unbiased estimates of distance-dependent observables, and 
of the joint distribution of luminosity, color and size, from magnitude-limited photometric redshift 
datasets are necessary (Rossi & Sheth 2007; Sheth 2007; Lima et al. 2008). In what 
follows, we show the essential inversion character of this class of problems by using a selected 
sample of early-type galaxies from the SDSS DR6 - for which both photo-zs and spectro-zs are 
known - and by applying our deconvolution techniques to reconstruct the true distributions and 
the scaling relations. 



2 The SDSS early-type sample 

The catalog we use is based on the Sloan Digital Sky Survey (SDSS) Data Release 6, available 
online through the Catalog Archive Server Jobs System (CasJobs). We adopt selection 
criteria suitable for early-type galaxies. Specifically, from the DR6 galaxy photometric sample 
(PhotoObjAll in the Galaxy view), and from the spectroscopic sample (SpecObjAll), we select 
objects according to these general criteria: 



• Petrosian magnitudes in the range $14.50 < m < 17.45$ for the r band; 

• Concentration index $r_{\rm petro,90}/r_{\rm petro,50} > 2.5$ in the i band; 

• Likelihood of the de Vaucouleurs model > 0.8; 

• Objects with both photometric and spectroscopic redshifts available. 
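
The criteria above can be sketched as a boolean mask over a catalog held in memory. This is only an illustration: the column names (`petro_mag_r`, `petro_r90_i`, `petro_r50_i`, `dev_likelihood`, `z_phot`, `z_spec`) are hypothetical placeholders, not actual SDSS field names, and the toy values are invented for the example.

```python
import numpy as np

def early_type_mask(cat):
    """Boolean mask for the four selection criteria listed above.

    `cat` maps (hypothetical) column names to equal-length NumPy arrays.
    """
    concentration = cat["petro_r90_i"] / cat["petro_r50_i"]
    return (
        (cat["petro_mag_r"] > 14.50) & (cat["petro_mag_r"] < 17.45)  # r-band magnitude range
        & (concentration > 2.5)                                      # i-band concentration index
        & (cat["dev_likelihood"] > 0.8)                              # de Vaucouleurs likelihood
        & np.isfinite(cat["z_phot"]) & np.isfinite(cat["z_spec"])    # both redshifts available
    )

# Toy catalog with four objects (invented values).
toy = {
    "petro_mag_r": np.array([15.0, 18.0, 16.0, 16.5]),
    "petro_r90_i": np.array([6.0, 6.0, 4.0, 9.0]),
    "petro_r50_i": np.array([2.0, 2.0, 2.0, 3.0]),
    "dev_likelihood": np.array([0.90, 0.90, 0.90, 0.95]),
    "z_phot": np.array([0.10, 0.10, 0.10, 0.12]),
    "z_spec": np.array([0.11, 0.10, 0.10, np.nan]),
}
mask = early_type_mask(toy)
```

In this toy catalog only the first object passes: the second is too faint, the third is not concentrated enough, and the fourth has no spectroscopic redshift.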

No redshift or velocity dispersion cuts were made. Our catalog contains 163,718 objects, and 
consists of model magnitudes, Petrosian radii, de Vaucouleurs and exponential fit scale radii 
along with their corresponding axis ratios in the r band, photometric redshifts and photo-z errors. 
We do not apply any K-corrections to our de-reddened model magnitudes, since our main goal 
is to test the deconvolution technique rather than to characterize the exact relations. We select 
photometric redshifts from the SDSS Photoz Table. This set of photometric redshifts has been 
obtained with the template fitting method, which simply compares the expected colors of a 
galaxy with those observed for an individual galaxy (Budavari et al. 2000). The spectroscopic 
pipeline instead assigns a final redshift to each object spectrum by choosing the emission or 
cross-correlation redshift with the highest confidence level (CL). In the selection of our sample, we tried to minimize 
the use of spectral information, but more robust constraints can be applied in order to reduce 
errors in galaxy classification. 

3 Essence of the deconvolution problem 

If we indicate with $\zeta$ and $z$ the photometric and spectroscopic redshifts, respectively, the problem 
of estimating the intrinsic redshift distribution $N(z)$ - the normalized number of objects which lie 
at redshift $z$ - is best thought of as a deconvolution problem. If $p(\zeta|z)$ is the probability of 
estimating the redshift as $\zeta$ when the true value is $z$, then the distribution of estimated redshifts 
is: 

$$\mathcal{N}(\zeta) = \int N(z)\, p(\zeta|z)\, dz. \tag{1}$$

Equation (1) is an integral equation of the first kind of the Fredholm type, with the conditional 
probability $p(\zeta|z)$ as kernel. A simple iterative scheme proposed by Lucy (1974) allows one to 
reconstruct the intrinsic distribution after a few iterations, given a suitable first guess. 
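
As an illustration of the kind of iteration involved, the following is a minimal sketch of a Lucy-type (Richardson-Lucy) update on binned data. It is not the DeFaST code: the function name, the Gaussian toy kernel and all numerical values are assumptions made for the example. The kernel is stored as a matrix whose element (i, j) approximates $p(\zeta_i|z_j)$, with each column summing to one.

```python
import numpy as np

def lucy_deconvolve(observed, kernel, first_guess, n_iter=10):
    """Iterate a Lucy-type update on binned distributions.

    observed    : binned distribution of estimated values
    kernel      : matrix with kernel[i, j] ~ p(estimated bin i | true bin j)
    first_guess : starting guess for the intrinsic distribution
    """
    observed = np.asarray(observed, dtype=float)
    estimate = np.asarray(first_guess, dtype=float).copy()
    for _ in range(n_iter):
        predicted = kernel @ estimate  # what the current guess would look like once observed
        ratio = np.divide(observed, predicted,
                          out=np.zeros_like(observed), where=predicted > 0)
        estimate = estimate * (kernel.T @ ratio)  # multiplicative correction
    return estimate

# Toy example (invented numbers): Gaussian intrinsic N(z), Gaussian photo-z scatter.
z = np.linspace(0.0, 0.4, 81)
truth = np.exp(-0.5 * ((z - 0.2) / 0.03) ** 2)
truth /= truth.sum()
kernel = np.exp(-0.5 * ((z[:, None] - z[None, :]) / 0.02) ** 2)
kernel /= kernel.sum(axis=0, keepdims=True)  # each column is a normalized p(zeta | z)
observed = kernel @ truth                    # broadened, "photometric" distribution
recovered = lucy_deconvolve(observed, kernel, first_guess=observed, n_iter=10)
```

Using the observed distribution itself as the first guess mirrors the choice described for the one-dimensional reconstructions in this paper. Because the columns of the kernel are normalized, each update preserves the total number of objects.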

Similarly, let $M$ denote the true absolute magnitude and $\mathcal{M}$ that estimated using $\zeta$ rather 
than $z$. Use $D_L(z)$ to denote the luminosity distance, and $\phi(M)$ to indicate the number density 
of galaxies with absolute magnitude $M$. Let $V_{\max}(M)$ denote the largest comoving volume out to 
which an object of absolute magnitude $M$ can be seen, and $V_{\min}(M)$ the analogous volume if the catalog is 
also limited at the lower end. The (true) number of galaxies with absolute magnitude $M$ in a 
magnitude-limited catalog is: 

$$N(M) = \phi(M)\,[V_{\max}(M) - V_{\min}(M)] \tag{2}$$

and the total number of objects with estimated absolute magnitudes $\mathcal{M}$ is: 





where 

Figure 1: [Left] Observed, intrinsic and reconstructed redshift distributions for the SDSS DR6 early-type sample. 
The dotted histogram was used as a starting guess for the one-dimensional deconvolution algorithm. Convergence 
is achieved after a few iterations. [Center] Reconstruction of the intrinsic $N(M)$ distribution from the distribution 
of estimated redshifts. Dotted histogram shows the observed absolute magnitude distribution, used as a starting 
guess. Jagged line is the reconstructed intrinsic distribution, after 10 iterations. [Right] Reconstruction of the 
intrinsic $N(R)$ distribution from the distribution of estimated redshifts. Dotted histogram shows the observed size 
distribution, used as a starting guess. Jagged line shows the reconstructed intrinsic distribution after 8 iterations. 



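Note that since $V_{\max}$ and $V_{\min}$ are known functions of $M$, the quantity defined in (4) is itself just a 
complicated function of $M$ and $\mathcal{M}$. Dividing (4) by $[V_{\max}(M) - V_{\min}(M)]$ yields: 

$$\int dD_L\, \frac{dV_{\rm com}/dD_L}{V_{\max}(M) - V_{\min}(M)}\; p(\mathcal{M}-M|M,D_L)
 = \int dD_L\, p(D_L)\, p(\mathcal{M}-M|M,D_L)
 = \int dD_L\, p(D_L)\, p(\mathcal{M}|M,D_L)
 = p(\mathcal{M}|M). \tag{5}$$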

Therefore, the observed magnitude distribution can be expressed as a simple one-dimensional 
convolution of the intrinsic one, namely: 

$$\mathcal{N}(\mathcal{M}) = \int N(M)\, p(\mathcal{M}|M)\, dM. \tag{6}$$



Along the same lines, use $R$ to denote $\log_{10}$ of the physical size, and $\mathcal{R}$ to denote the 
estimated size based on the photometric redshift $\zeta$. Then it is readily shown that one can also 
think of $\mathcal{N}(\mathcal{R})$ as being a convolution of the true number of objects with size $R$, 

$$\mathcal{N}(\mathcal{R}) = \int N(R)\, p(\mathcal{R}|R)\, dR. \tag{7}$$

Direct measurements of the conditional probabilities allow one to reconstruct the intrinsic 
distributions from the observed ones, using a simple one-dimensional deconvolution. Similarly, a 
two-dimensional extension of the previous formalism is necessary if scaling relations are 
reconstructed from photometric data (Rossi & Sheth 2007). 
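
To illustrate what a direct measurement of a conditional probability can look like in practice, the sketch below bins a matched sample of true and estimated values and normalizes each column of the resulting two-dimensional histogram. The function name and the synthetic redshifts are assumptions made for the example. The paper's own code interpolates the measured distributions with splines, which is not reproduced here.

```python
import numpy as np

def conditional_kernel(true_vals, est_vals, edges):
    """Binned estimate of p(estimated | true) from a matched sample.

    Returns a matrix whose element (i, j) is the fraction of objects with
    true value in bin j whose estimated value falls in bin i.
    """
    counts, _, _ = np.histogram2d(est_vals, true_vals, bins=[edges, edges])
    col_sums = counts.sum(axis=0, keepdims=True)
    # Normalize each column; columns with no objects are left as zeros.
    return np.divide(counts, col_sums, out=np.zeros_like(counts), where=col_sums > 0)

# Synthetic matched sample (invented numbers): "spectroscopic" and noisy "photometric" redshifts.
rng = np.random.default_rng(1)
z_spec = rng.uniform(0.05, 0.30, 20_000)
z_phot = z_spec + rng.normal(0.0, 0.02, z_spec.size)
edges = np.linspace(0.0, 0.35, 36)
kernel = conditional_kernel(z_spec, z_phot, edges)
populated = kernel.sum(axis=0) > 0
```

The same function applies unchanged to pairs $(M, \mathcal{M})$ or $(R, \mathcal{R})$, since only the matched true and estimated values enter.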

4 Redshift, magnitude and size distributions. Scaling relations 



Results of applying our deconvolution techniques to the observed redshift, magnitude and size 
distributions are shown in Figure 1. Specifically, the left panel shows the photometric or observed 
redshift distribution (dotted line), the spectroscopic or intrinsic distribution (solid line) and its 
reconstruction after a few iterations (jagged line), based on the Lucy (1974) inversion algorithm. 
The $p(\zeta|z)$ distributions are inferred directly from the SDSS data, and in our deconvolution code 
(DeFaST) we use splines to interpolate these conditional distributions. In the same fashion, 
by measuring the conditional probabilities $p(\mathcal{M}|M)$ and $p(\mathcal{R}|R)$ directly from the catalog, it 
is possible to apply the one-dimensional deconvolution algorithm to reconstruct the magnitude 
and size distributions (equations (6) and (7)). The central panel shows the reconstruction (jagged 
line) of the intrinsic distribution of absolute magnitudes (solid histogram) after 10 iterations. 
The observed distribution of $\mathcal{M}$ (dotted line) was used as a convenient starting guess in the 
deconvolution algorithm. Similarly, the right panel shows the one-dimensional reconstruction 
(jagged line) of the size distribution. The intrinsic distribution of physical sizes (solid line) is 
recovered after a few iterations, when the observed distribution of $\mathcal{R}$ (dotted line) is used as a 
convenient starting guess. 

Figure 2: Effect of photo-zs on the size-luminosity correlation in our SDSS early-type catalog. In the left panel, 
contours and solid line show the $\mathcal{R}-\mathcal{M}$ relation associated with photo-zs, whereas the right panel shows the 
intrinsic $R-M$ relation measured from spectro-zs. Note the bias (shallower slope in panel on left) which results 
from the fact that the photo-z distance error moves points down and left or up and right on this plot. Squares in left 
panel show the binned starting guess for the 2d deconvolution algorithm, triangles in right panel show the result 
after 7 iterations. Circles are the expected binned intrinsic relation, obtained from spectroscopic information. 

Although the difference between the intrinsic and observed size distributions is remarkably 
small, this departure suffices to bias the size-luminosity relation, as presented in Figure 2. In 
fact, photometric redshift errors broaden both the magnitude and size distributions, but changes 
to the estimated absolute magnitudes and sizes are clearly not independent. These correlated 
changes have a significant effect on the size-luminosity relation, even when the broadening of 
one of the two distributions is not severe. In our SDSS catalog $\langle\mathcal{R}|\mathcal{M}\rangle \propto -0.226\,\mathcal{M}$, whereas 
$\langle R|M\rangle \propto -0.257\,M$. Figure 2 shows that the use of photo-zs introduces a bias in the 
size-luminosity relation (shallower slope in panel on left). Squares in left panel show the binned 
starting guess for the two-dimensional deconvolution algorithm, triangles in right panel show 
the result after 7 iterations and circles are the expected binned intrinsic relation, obtained from 
spectroscopic information. Convergence to the true solution is clearly seen. 
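
The correlated nature of these changes can be illustrated with a toy calculation. The sketch below assumes a low-redshift limit in which distance is simply proportional to redshift, ignoring factors of $(1+z)$, K-corrections and the magnitude limit of the catalog. In that limit a redshift error shifts the magnitude by $-5\log_{10}(\zeta/z)$ and the log-size by $+\log_{10}(\zeta/z)$, so each object moves along a line of slope $-0.2$ in the size-magnitude plane. The intrinsic relation, its scatter and the photo-z error used here are invented values, chosen only so that the intrinsic slope matches the $-0.257$ quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Toy "spectroscopic" redshifts and noisy "photometric" ones (assumed scatter 0.02).
z = rng.uniform(0.05, 0.25, n)
zeta = z + rng.normal(0.0, 0.02, n)
keep = zeta > 0.01                      # discard unphysical estimates
z, zeta = z[keep], zeta[keep]

# Toy intrinsic size-magnitude relation with slope -0.257 (zero point and scatter are made up).
M = rng.normal(-21.5, 0.8, z.size)
R = -0.257 * (M + 21.5) + 0.5 + rng.normal(0.0, 0.1, z.size)

# Low-redshift approximation: distance proportional to redshift.
shift = np.log10(zeta / z)
M_est = M - 5.0 * shift                 # overestimated distance -> more luminous
R_est = R + shift                       # overestimated distance -> larger physical size

slope_true = np.polyfit(M, R, 1)[0]
slope_est = np.polyfit(M_est, R_est, 1)[0]
```

In this idealized setting the fitted slope of the estimated quantities is shallower than the intrinsic one, because the errors scatter objects along a direction of slope $-0.2$. This is the same sense as the bias seen in the catalog, although the toy model is not meant to reproduce the measured values.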



5 Summary 



Using a selected sample of early-type galaxies from the SDSS DR6, for which both photo-zs 
and spectro-zs are known, we applied our one- and two-dimensional deconvolution techniques 
(Sheth 2007; Rossi & Sheth 2007) to reconstruct the unbiased redshift, magnitude and size 
distributions, as well as the magnitude-size relation. We showed that our technique recovers 
all the true distributions and the joint relation, to a good degree of accuracy. We argued 
that the problem of reconstructing the true magnitude or size distribution is best thought of as a 
one-dimensional deconvolution problem, and provided a little algebra to show that this is indeed 
possible. We showed that even if the distribution of physical sizes is almost unbiased, a bias in 
the magnitude distribution suffices to compromise the size-luminosity relation in an important 
way. We used our 2D technique to correct for this effect. 

Although the discussion was phrased mainly in terms of the luminosity-size relation, the 
methods developed here are quite general and can be applied to recover any intrinsic correlations 
between distance-dependent quantities (even for n-correlated variables). Potentially, they impact 
a broader range of studies when at least one distance-dependent quantity is involved. 